A new adaptive long-term spectral estimation voice activity detector

Authors

  • Javier Ramírez
  • José C. Segura
  • M. Carmen Benítez
  • Ángel de la Torre
  • Antonio J. Rubio
Abstract

This paper presents an efficient voice activity detector (VAD) based on estimating the long-term spectral divergence (LTSD) between noise and speech periods. The proposed method decomposes the input signal into overlapped speech frames, uses a sliding window to compute the long-term spectral envelope, and measures the speech/non-speech LTSD, thus yielding a highly discriminating decision rule and minimizing the average number of decision errors. To increase non-speech detection accuracy, the decision threshold is adapted to the measured noise energy, while a controlled hang-over is activated only when the observed signal-to-noise ratio (SNR) is low. An exhaustive analysis of the proposed VAD is carried out using the AURORA TIdigits and SpeechDat-Car (SDC) databases. The proposed VAD is compared to the most commonly used VADs in the field in terms of speech/non-speech detection and recognition performance. Experimental results demonstrate a sustained advantage over the G.729, AMR and AFE VADs.
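
The processing chain described in the abstract (overlapped frames, a sliding-window spectral envelope, a divergence measure, and a noise-adapted threshold) can be sketched roughly as follows. The abstract gives no formulas, so everything concrete here is an assumption: the envelope is taken as the per-bin maximum over `2 * order + 1` neighbouring frames, the LTSD as the mean envelope-to-noise power ratio in dB, the noise spectrum as a fixed average of the leading frames, and the threshold as a linear interpolation between two hypothetical values `g0` and `g1`. All function and parameter names are illustrative, and the hang-over scheme is left out.

```python
import cmath
import math


def split_frames(signal, frame_len, hop):
    """Overlapped frames: consecutive frames share frame_len - hop samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes (bins 0..n/2) of one real-valued frame."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def ltsd(spectra, noise, l, order):
    """Assumed LTSD (dB) of frame l against a noise magnitude spectrum."""
    lo = max(0, l - order)
    hi = min(len(spectra) - 1, l + order)
    # Long-term spectral envelope: per-bin maximum over the sliding window.
    envelope = [max(spectra[j][k] for j in range(lo, hi + 1))
                for k in range(len(noise))]
    ratio = sum((e * e) / (n * n) for e, n in zip(envelope, noise))
    return 10.0 * math.log10(ratio / len(noise))


def adaptive_threshold(noise_energy, e0, e1, g0, g1):
    """Assumed rule: interpolate linearly from g0 (quiet) to g1 (noisy)."""
    if noise_energy <= e0:
        return g0
    if noise_energy >= e1:
        return g1
    return g0 + (g1 - g0) * (noise_energy - e0) / (e1 - e0)


def detect(signal, frame_len=64, hop=32, order=3,
           noise_frames=4, threshold_db=20.0):
    """Per-frame speech flags; the first noise_frames are assumed non-speech."""
    spectra = [magnitude_spectrum(f)
               for f in split_frames(signal, frame_len, hop)]
    bins = len(spectra[0])
    # Fixed noise estimate from the leading frames, floored to avoid
    # division by zero. A real detector would keep updating this.
    noise = [max(sum(s[k] for s in spectra[:noise_frames]) / noise_frames,
                 1e-12)
             for k in range(bins)]
    return [ltsd(spectra, noise, l, order) > threshold_db
            for l in range(len(spectra))]
```

Calling `detect(samples)` on a list of floats returns one boolean per frame. In this sketch the threshold is passed in as a constant; `adaptive_threshold` shows one way it could instead be derived from a measured noise energy before each decision.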

Similar resources

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

Efficient voice activity detection algorithms using long-term speech information

Currently, there are technology barriers that prevent speech processing systems from working under extremely noisy conditions. The emerging applications of speech technology, especially in the fields of wireless communications, digital hearing aids or speech recognition, are examples of such systems and often require a noise reduction technique operating in combination with a precise voice activity dete...

Spectral subtraction with full-wave rectification and likelihood controlled instantaneous noise estimation for robust speech recognition

In standard Spectral Subtraction (SS), Half-Wave Rectification SS (HWR-SS) is normally applied to avoid negative values in the Power Spectral Density (PSD) that occur mainly due to inaccurate noise estimation caused by a Voice Activity Detector (VAD). In this paper analyses show that, given accurate noise estimation, the phase relationship between speech and noise becomes the dominant cause of ...

Efficient voice activity detection algorithm using long-term spectral flatness measure

This paper proposes a novel and robust voice activity detection (VAD) algorithm utilizing a long-term spectral flatness measure (LSFM), which is capable of working at signal-to-noise ratios (SNRs) of 10 dB and lower. This new LSFM-based VAD improves speech detection robustness in various noisy environments by employing a low-variance spectrum estimate and an adaptive threshold. The discriminative powe...

Approach for Energy-Based Voice Detector with Adaptive Scaling Factor

This paper presents an alternative energy-based algorithm for speech/silence classification. The algorithm is capable of tracking non-stationary signals and dynamically calculating an instantaneous threshold value using an adaptive scaling parameter. It is based on the observation that a noise power estimate used for computing the threshold can be obtained using minimum and maximum values ...

Publication date: 2003